A document-oriented database, or document store, is a computer program designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data. Document-oriented databases are one of the main categories of NoSQL databases, and the popularity of the term "document-oriented database" has grown (DB-Engines Ranking per database model category) with the use of the term NoSQL itself. Document-oriented databases are inherently a subclass of the key-value store, another NoSQL database concept. The difference lies in the way the data is processed: in a key-value store the data is considered to be inherently opaque to the database, whereas a document-oriented system relies on internal structure in the ''document'' in order to extract metadata that the database engine uses for further optimization. Although the difference is often moot due to tools in the systems, conceptually the document store is designed to offer a richer experience with modern programming techniques. XML databases are a specific subclass of document-oriented databases that are optimized to extract their metadata from XML documents.

Document databases contrast strongly with the traditional relational database (RDB). Relational databases are strongly typed during database creation, and store repeated data in separate ''tables'' that are defined by the programmer. In an RDB, every instance of data has the same format as every other, and changing that format is generally difficult. Document databases get their type information from the data itself, normally store all related information together, and allow every instance of data to be different from any other. This makes them more flexible in dealing with change and optional values, lets them map more easily onto program objects, and often reduces database size. These properties make them attractive for programming modern web applications, which are subject to continual change in place and for which speed of deployment is an important issue.
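The idea that type information comes from the data itself can be illustrated with a small sketch. The two JSON contact records and the `describe` helper below are hypothetical and chosen only for illustration; the point is that no schema is declared up front, and the second record carries fields the first one lacks.

```python
import json

# Two hypothetical contact documents; the second has fields the first lacks.
raw_documents = [
    '{"name": "Bob Smith", "state": "AR"}',
    '{"name": "Ann Lee", "age": 41, "phones": ["555-0100", "555-0101"]}',
]


def describe(raw):
    """Return a field -> type-name mapping read from the document itself."""
    document = json.loads(raw)
    return {field: type(value).__name__ for field, value in document.items()}


# Each document reports its own shape; nothing was declared in advance.
shapes = [describe(raw) for raw in raw_documents]
for shape in shapes:
    print(shape)
```

In a relational table, by contrast, the `age` and `phones` columns would have to be declared before either record could be stored.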
== Documents ==
The central concept of a document-oriented database is the ''document'', a term used in the usual English sense of a group of data that encodes some sort of user-readable information. This contrasts with the ''value'' in the key-value store, which is assumed to be opaque data. The basic concept that makes a database document-oriented as opposed to key-value is the idea that the documents include internal structure, or metadata, that the database engine can use to further automate the storage and provide more value.

To understand the difference, consider this text document:

 Bob Smith
 123 Back St.
 Boys, AR, 32225
 US

Although it is clear to the reader that this document contains the address for a contact, there is no information within the document that indicates that, nor information on what the individual fields represent. This file could be stored in a key-value store, but the semantic content that this is an address may be lost, and the database has no way to know how to optimize or index this data by itself. For instance, there is no way for the database to know that "AR" is the state and add it to an index of states; it is simply a piece of data in a string that also includes the city and zip code. It is possible to add logic to deconstruct the string into fields, extracting the state by looking for the middle item of three comma-separated values in the 3rd line, but this is not a simple task. If another line is added to the address, for a PO Box or suite number for instance, the state information is in the 4th line instead of the 3rd. Without additional information, parsing free-form data of this sort can be complex.

Now consider the same document marked up in pseudo-XML, with each field enclosed in a named tag. In this case, the document includes both data and the metadata explaining each of the fields. A key-value store receiving this document would simply store it.
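The contrast can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular database is implemented: the tag names in `CONTACT_XML` are hypothetical, and `KeyValueStore` and `DocumentStore` are toy classes invented for this example. The key-value store keeps the text as an opaque value, while the document store also reads the markup and maintains an index on one field.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical markup for the address above; the tag names are assumptions.
CONTACT_XML = """<contact>
  <name>Bob Smith</name>
  <street>123 Back St.</street>
  <city>Boys</city>
  <state>AR</state>
  <zip>32225</zip>
  <country>US</country>
</contact>"""


class KeyValueStore:
    """Stores values as opaque strings; it never looks inside them."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]


class DocumentStore(KeyValueStore):
    """Also parses each document and indexes one chosen field."""

    def __init__(self, indexed_field):
        super().__init__()
        self._field = indexed_field
        self._index = defaultdict(set)  # field value -> keys of matching docs

    def put(self, key, value):
        # Drop any stale index entry left by an earlier version of this key.
        for keys in self._index.values():
            keys.discard(key)
        super().put(key, value)
        # Use the document's own metadata (its tags) to locate the field.
        element = ET.fromstring(value).find(self._field)
        if element is not None and element.text:
            self._index[element.text.strip()].add(key)

    def find(self, field_value):
        """Return the keys of all documents whose indexed field matches."""
        return sorted(self._index.get(field_value, ()))


kv = KeyValueStore()
kv.put("c1", CONTACT_XML)        # stored, but cannot be searched by state

docs = DocumentStore(indexed_field="state")
docs.put("c1", CONTACT_XML)
print(docs.find("AR"))           # ['c1']
```

Because the state sits in its own tagged field, adding a PO Box or suite line to the address would not change how it is found, unlike the line-counting approach needed for the free-form text.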
In the case of a document store, the system understands that contact documents may have a state field, allowing the programmer to issue queries such as "find all the contacts in a given state".

Now consider a slightly more complex example, in which a number of the fields are either repeated or split out into separate containers. With similar hints, the document store will allow searches on those repeated and nested fields as well.

This is another major advantage of the document-oriented concept; a single database can contain both of these documents. In addition to making it easier to handle different types of data, the metadata also allows the document format to be changed at any time without affecting the existing records. If one wishes to add a field, it can be included in new documents while existing ones are left as they are.

The usefulness of this sort of introspection of the data is not lost on the designers of other database systems. Many key-value stores include some or all of the functionality of dedicated document stores, and a number of relational databases, notably PostgreSQL and Informix, have added functionality to make these sorts of operations possible. It is not the ability to provide these functions that defines document orientation, but the ease with which these functions can be implemented and used; a document-oriented database is designed from the start to work with complex documents, and should make it easier to access this functionality than a system where it was added after the fact.

Practically any "document" containing metadata can be managed in this fashion, and common examples include XML, YAML, JSON, and BSON. Some document-oriented databases include functionality to help map data lacking clearly defined metadata. For instance, many engines include functionality to index PDF or TeX documents, or may include predefined document formats that are in turn based on XML, like MathML, JATS or DocBook. Some allow documents to be mapped onto a more suitable format using a schema language such as DTD, XSD, Relax NG, or Schematron.
Others may include tools to map enterprise data, like column-delimited text files, into formats that can be read more easily by the database engine. Still others take the opposite route, and are dedicated to one type of data format, JSON. JSON is widely used in online programming for interactive web pages and mobile apps, and a niche has appeared for document stores dedicated to handling it efficiently.

Some of the most popular Web sites are document databases, including the many collections of articles at pubmed.gov or major journal publishers; Wikipedia and its kin; and even search engines (though many of those store links to indexed documents, rather than the full documents themselves).

Source: Wikipedia, the free encyclopedia, "Document-oriented database".